Internet Message Format | 1993-08-09 | 15KB
From: pcolmer@acorn.co.uk (Philip Colmer)
Date: 26 Oct 90 12:55:54 GMT
Time to put this one to rest, methinks ...
When you format a hard drive, be it an ADFS drive or a SCSI drive, you can
choose between "Old map" and "New map" formats. To see which format your
hard drive is in, type
*ADFS:Map :4
(or :5, 6 or 7)
If the output says "new map", you've got a new map format.
Some facts about the new map format:
* this format has the ability to store smallish files within the space taken
on the disc by the directory. The maximum size of these files is the value
given to the LARGE FILE ALLOCATION UNIT, or LFAU.
* the LFAU is specified when you format the hard drive. Apart from
specifying the maximum size of a file which can be fitted within a
directory space (NB *NOT* the maximum size of a file which can be stored
on the disc), it is also related to the CAPACITY of the drive.
The larger the drive, the higher the LFAU needs to be in order for the
drive to be able to be formatted. The following is a table of disc sizes
versus possible LFAUs:
Capacity           LFAU   "Size" of a directory (approx)
========           ====   ==============================
Up to 124Mbytes     256    4K
Up to 249Mbytes     512    8K
Up to 499Mbytes    1024   16K
Up to 999Mbytes    2048   32K
The "up to" values are inclusive. You can, of course, specify a larger
LFAU, but the side effect is that the larger the LFAU, the more disc space
a directory takes, hence the "wasted disc space".
This arrangement, therefore, allows files up to the LFAU to be stored on
the disc without taking up any more disc space, and without fragmenting
the disc too much.
* when saving files to a new format disc, FileCore will automatically try
to coalesce blocks to avoid fragmentation.
* to determine what value LFAU your drive has:
DIM block% 64
SYS "ADFS_DescribeDisc", ":4", block%
REM byte 5 of the disc record is log2 of the bytes per map bit
PRINT "LFAU is "; 2^(block%?5)
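The capacity table above can also be sketched as a lookup (a minimal C
sketch; the capacity bands come from the table, while the function names
are mine, not FileCore's):

```c
#include <assert.h>

/* Minimum LFAU (in bytes) for a given drive capacity, per the table
   above.  Each doubling of the capacity band doubles the LFAU.
   Returns 0 for capacities beyond the table (over 999 Mbytes). */
static int min_lfau(long capacity_mbytes)
{
    if (capacity_mbytes <= 124) return 256;
    if (capacity_mbytes <= 249) return 512;
    if (capacity_mbytes <= 499) return 1024;
    if (capacity_mbytes <= 999) return 2048;
    return 0;
}

/* Approximate directory size is 16 times the LFAU, matching the
   third column of the table (256 -> 4K, 512 -> 8K, and so on). */
static int dir_size_bytes(int lfau)
{
    return lfau * 16;
}
```

So min_lfau(300) gives 1024, and a directory on such a drive occupies
about dir_size_bytes(1024) = 16K.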
Some facts about the old map format:
* each directory always takes 4Kbytes
* the disc is not automatically compacted.
Note that ALL of the above applies to any FileCore based filing system. This
includes ADFS and SCSIFS.
Hope this clears everything up.
--Fil.
From: nick@abccam.abcl.co.uk (Nick Reeves)
Summary: E FORMAT DESIGN DOCUMENT
Date: 26 Oct 90 16:53:50 GMT
Organization: Active Book Company Limited, Cambridge, UK
Briefly:
The large directory size is the price you pay for a disc structure
which almost always keeps files contiguously, but can use fragments if
it has to. The rest of the space is not always wasted as small files
in the directory can use it, and only have their length rounded up to
the sector size rather than the cluster size. They are also quick to
access as they are close to the directory and don't need to update the
free space map.
NEW DISC FILE STRUCTURE FOR RISC OS
===================================
MOTIVATION
----------
The aim of the new file structure is firstly to remove the following
restrictions of the original file structure.
1. files had to be stored contiguously on disc, giving "compaction required"
   and "can't extend" errors
2. there was a maximum number of entries in the free space map, giving the
   "map full" error
3. defective sectors could not be mapped out of floppy discs
However most file structures which overcome these (eg MS-DOS and UNIX) pay a
heavy penalty in performance, for the following reasons. As files may be split
up into several pieces, the information on disc allocation is greatly
increased. The structure(s) to keep track of free space and disc allocation
are typically too large to be kept in RAM for hard discs. Therefore the disc
allocation algorithms tend to be designed to minimise scanning of these
structures, rather than minimising fragmentation. This greatly speeds up the
fragmentation of free space and files as blocks are claimed and released.
Even if utilities exist to rationalise disc structure, an unintelligent
allocation scheme quickly fragments things again.

Fragmentation degrades performance since the parameters of standard disc
drives are such that the time spent seeking between the fragments of a file
will dominate the time spent transferring data unless the fragments are
greater than a track length on average. Therefore if the unit of disc
allocation is small, fragmentation soon reduces most fragments to a few units
in length, giving slow performance; but if the allocation unit is big enough
to give anything close to the possible performance, the allocation units are
so big that a large part of the disc space is wasted in rounding files up to
allocation units. These points led to the following additional aims.
4. The disc allocation and free space structure(s) should be small enough to
be kept in RAM even for large hard discs.
5. The allocation strategies should be intelligent to slow the rate of
fragmentation and should not produce small fragments.
6. Long term fragmentation will still occur so the allocation strategy should
have the option of rationalising disc allocation, which should not produce
a noticeable delay to the user.
STRUCTURE OVERVIEW
------------------
The disc structure is made up of the following parts
1. (hard discs only) A boot block containing the defect list and other
information
2. A double copied map which allows you to deduce for any cluster (unit of disc
allocation) whether it is allocated and if so to what position in which file
3. A hierarchical tree of directories
4. The remaining space is used for (fragments of) files or is unallocated
1 and 3 are the same as for the old file structure (see OldMap document)
except that disc addresses of files and directories are replaced by system
internal numbers (SIN). The SIN includes a file number which can be looked up
in the map to find the allocated disc space.
NEW MAP STRUCTURE
-----------------
Most of the map is bitstream, with each bit in the map being associated with
the allocation of a fixed number of bytes of disc space. This size (called
the bit size) is always a power of two. The cluster size is the maximum of
the bit size and the sector size. The map is divided into zones by the natural
sector boundaries. Floppies are limited to one zone in the present
implementation. The structure of a zone is as follows
start byte  length  use
0           1       checksum on this zone, calculated as below
1           2       offset in bits to first free space in zone, or 0 if
                    none, with top bit always set
3           1       this byte, when EORed with the values for the other
                    zones, should give &FF
4           60      (zone 0 only) the first 32 bytes are a disc record
                    describing the disc; the other 28 bytes are 0, being
                    reserved
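The zone header fields can be read back with a few fixed offsets (a C
sketch following the layout above; the struct and function names are my
own, not FileCore's):

```c
#include <assert.h>

struct zone_header {
    unsigned char check;        /* byte 0: checksum on the zone            */
    unsigned freespace_bits;    /* bytes 1-2: bit offset of first free     */
                                /* space in the zone, 0 if none            */
    unsigned char cross_check;  /* byte 3: EORs with the other zones'      */
                                /* bytes to give &FF                       */
};

/* Decode the first four bytes of a zone.  The 16-bit free-space field
   is little-endian with its top bit always set; returns 0 (failure)
   if that marker bit is missing, 1 on success. */
static int read_zone_header(const unsigned char *z, struct zone_header *h)
{
    unsigned v = z[1] | ((unsigned)z[2] << 8);
    if (!(v & 0x8000)) return 0;    /* top bit must always be set */
    h->check = z[0];
    h->freespace_bits = v & 0x7FFF;
    h->cross_check = z[3];
    return 1;
}
```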
The rest of the zone is the bitstream. All entries in the bitstream have the
same format. The start of an entry is an id field (of width given in the disc
record), the end is marked by a 1, and any other bits are set to 0. The
bitstream contains the following three types of object with their respective
interpretation of the id field.
OBJECT TYPE               ID VALUE
a) (fragments of) files   file number (minimum value 2)
b) unallocated fragments  bit offset of next free fragment in zone, or 0
c) disc defects           always 1
When searching for the fragments of a file, the zone containing the first
fragment can be calculated from
zone = file number DIV ids per zone
where ids per zone = map bits in zone DIV (width of id field + 1)
An exception is the file number 2. This is reserved for the structure which is
placed on an empty disc (boot block, map and root directory). Searching for
file number 2 should start at zone = total zones DIV 2. This is to allow the
root directory and map of hard disc to be placed near the middle.
The rest of the file's fragments (if any) can be found by searching through
the zones in ascending order, wrapping back to zone 0 if necessary, looking
for fragments with the correct file number. However because of the allocation
strategies given below it is rare for a file to be in more than one piece
unless it has been extended or is very large.
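The starting-zone calculation can be sketched in C (the example figures
in the test are my own assumptions; the search wrap-around described
above is not shown):

```c
#include <assert.h>

/* Number of id fields that fit in one zone's bitstream: each entry
   needs the id field plus its terminating 1 bit. */
static unsigned ids_per_zone(unsigned map_bits_in_zone, unsigned idlen)
{
    return map_bits_in_zone / (idlen + 1);
}

/* Zone in which to start searching for a file's first fragment.
   File number 2 (the structure placed on an empty disc) is
   special-cased to start at the middle zone. */
static unsigned start_zone(unsigned file_number, unsigned ids_in_zone,
                           unsigned total_zones)
{
    if (file_number == 2)
        return total_zones / 2;
    return file_number / ids_in_zone;
}
```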
The need for an id field places a minimum size on a fragment. For
example possible fragment sizes are 2K, 3K, 4K, 5K ... for an 800K
floppy, or 12K, 13K, 14K, 15K ... for a 20 Mb hard disc with a ten
zone 2.5K map. Thus a compact map representation (down to 1 bit per
cluster if necessary) has been achieved by making it impossible to
represent inefficient allocations, (ie those with small fragments).
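The minimum fragment size follows directly: the id field plus its
terminating 1 bit must fit in the fragment's run of map bits (a sketch;
the 800K floppy figures of idlen 15 and 128 bytes per map bit used in
the test are my assumption, chosen to reproduce the 2K example above):

```c
#include <assert.h>

/* Smallest representable fragment: idlen bits of id field plus the
   terminating 1 bit, each map bit covering bpmb bytes of disc. */
static unsigned long min_fragment_bytes(unsigned idlen, unsigned long bpmb)
{
    return (idlen + 1UL) * bpmb;
}
```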
In order to avoid wasting disc space for small files, when a file or directory
does not use the whole of its allocated fragment and the fragment is too small
to split, sharing is allowed. Directories can share fragments with their
sub files, and files in the same directory can share a fragment. Sharing is at
sector rather than cluster resolution, saving disc space. This also improves
performance, firstly because head movement is reduced since small files are
closer to each other and their parent directory on average, and secondly the
map does not always need to be modified for small files.
The SIN of a file is a 24 bit value. Bits 0-7 are the sector offset+1 in the
fragment if the file may share a fragment or 0 if not. Bits 8-23 are the file
number.
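The SIN bit layout can be captured in a few one-liners (a sketch; the
function names are mine):

```c
#include <assert.h>

/* A SIN is a 24-bit value: bits 0-7 hold the sector offset + 1 within
   the fragment (0 if the object may not share a fragment), and
   bits 8-23 hold the file number. */
static unsigned make_sin(unsigned file_number, unsigned sector_offset_plus_1)
{
    return ((file_number & 0xFFFFu) << 8) | (sector_offset_plus_1 & 0xFFu);
}

static unsigned sin_file_number(unsigned sin)
{
    return (sin >> 8) & 0xFFFFu;
}

static unsigned sin_sector_offset_plus_1(unsigned sin)
{
    return sin & 0xFFu;    /* 0 means the object may not share */
}
```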
ALLOCATION STRATEGIES
---------------------
Directories are allocated from the middle zone outwards and files
produced by whole file operations are allocated from the zone
containing their parent outwards. For hard discs as well as keeping
files in the same directory close to each other and their parent
directory, it also tends to keep small files near the middle of the
disc and large files further out. Both these reduce head movement.
When choosing disc space on openout it is put at the start of a zone
which balances distance from parent directory with amount of free
space in the zone.
As all allocation is indirected through the map it is possible to
re-allocate to reduce fragmentation without having to read or write
directories and a zone compaction routine is available for use by the
allocation routines. This is highly optimised both in the choice of moves (eg
it can spot fragments of the same file that can be joined) and in the
execution of the moves. It builds up lists of moves which it cancels,
combines, joins, splits, collects together in groups to be done together, and
sorts into an order that reduces head movement for the scatter read/write
primitives of the device drivers. Small compactions happen in response to
particular allocation needs and opportunities, rather than compacting a whole
zone or disc at once, so it is not usually apparent to the user.
Files produced by whole file operations (eg SAVE) tend to be longer lived
than those produced by partial file operations (eg OPENOUT, GBPB). So if
they cannot be allocated a single extent compaction continues until a free
space of the correct size or data totalling twice the length of the file has
been moved. If compaction fails the file is allocated either a pair of
extents which totals the correct size or a set of adjacent free spaces
(possibly with an extent removed or added to give a better fit). The only
partial file operations that do compaction are open and close, and then only
if a very good opportunity is found. As files produced by whole file ops are
almost always allocated a single extent, and long lived files produced by
partial file ops tend to re-assemble their fragments as compactions happen,
seeking between extents of a file is greatly reduced.
MINOR DETAILS
-------------
The checksum on a zone is calculated/checked as below
;entry
; R0 -> start
; R1 zone length
;exit
; LR check byte, Z=0 <=> good
NewCheck ROUT
Push "R1,R2,LR"
MOV LR, #0
ADDS R1, R1, R0 ;C=0
loop
LDR R2, [R1, #-4]!
ADCS LR, LR, R2
TEQS R1, R0 ;preserves C
BNE loop
AND R2, R2, #&FF ;ignore old sum
SUB LR, LR, R2
EOR LR, LR, LR, LSR #16
EOR LR, LR, LR, LSR #8
AND LR, LR, #&FF
CMPS R2, LR
Pull "R1,R2,PC"
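In C the same calculation might look like this (a sketch, not Acorn's
code: it walks the zone backwards accumulating words with the carry fed
back in, as ADCS does, removes the stored check byte, then folds the sum
down to 8 bits):

```c
#include <assert.h>
#include <stdint.h>

/* Compute the check byte for a zone of 'len' bytes (len a multiple of 4).
   Mirrors the NewCheck routine above: sum the little-endian words from
   the end of the zone with end-around carry, subtract the old check byte
   (the low byte of word 0, which is loaded last), then EOR-fold the
   32-bit sum to 8 bits. */
static uint8_t new_map_check(const uint8_t *zone, unsigned len)
{
    uint32_t sum = 0, word = 0;
    int carry = 0;
    unsigned i = len;
    while (i > 0) {
        i -= 4;
        word = (uint32_t)zone[i] | ((uint32_t)zone[i + 1] << 8) |
               ((uint32_t)zone[i + 2] << 16) | ((uint32_t)zone[i + 3] << 24);
        uint64_t t = (uint64_t)sum + word + (uint64_t)carry;
        sum = (uint32_t)t;          /* low 32 bits, as ADCS keeps       */
        carry = (int)(t >> 32);     /* carry out feeds the next add     */
    }
    sum -= word & 0xFFu;            /* drop the old check byte itself   */
    sum ^= sum >> 16;               /* fold 32 bits down to 8           */
    sum ^= sum >> 8;
    return (uint8_t)(sum & 0xFFu);
}
```

Because the stored byte is subtracted out, the function yields the same
answer whether or not the check byte has already been written at byte 0.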
The checksum on directories remains the same, but the validation string for a
directory has changed from 'Hugo' to 'Nick'.
; Directory Start
^ 0
StartMasSeq # 1
StartName # 4
DirFirstEntry # 0
; Old Directory End
^ 0
# -1
DirCheckByte # 0 ;RETRO DEFINITION was reserved
# -4
EndName # 0
# -1
EndMasSeq # 0
# -14 ;reserved
DirTitleSz * 19
# -DirTitleSz
OldDirTitle # 0
# -3
OldDirParent # 0
# -NameLen
OldDirName # 0
# -1
OldDirLastMark # 0 ;dummy last entry marker
; New Directory End
^ 0
# -1
ASSERT DirCheckByte=@
# -4
ASSERT EndName=@
# -1
ASSERT EndMasSeq=@
# -NameLen
NewDirName # 0
# -DirTitleSz
NewDirTitle # 0
# -3
NewDirParent # 0
# -1 ;reserved
# -1 ;reserved
# -1
NewDirLastMark # 0 ;dummy last entry marker
; ================
; TestDirCheckByte
; ================
; entry
; R3 ind disc add of dir
; R5 -> dir start
; R6 -> dir end
; exit
; LR check byte
; Z clear if matches existing byte
TestDirCheckByte
Push "R0-R2,R5,R7,LR"
BL EndDirEntries ;(R3,R5,R6->R0)
BL TestDir ;(R3->LR,Z)
ADDEQ R7, R6, #NewDirLastMark+1 ;dir tail
ADDNE R7, R6, #OldDirLastMark+1
05
BIC R1, R0, #3
MOV R2, #0
10 ;whole words before end of entries
LDR LR, [R5],#4
EOR R2, LR, R2,ROR #13
TEQS R5, R1
BNE %BT10
20 ;odd bytes before end of entries
LDRNEB LR, [R5], #1 ;not first pass through loop
EORNE R2, LR, R2,ROR #13
TEQS R5, R0
BNE %BT20
MOV R5, R7
30 ;odd bytes at start of tail
TSTS R5, #3
LDRNEB LR, [R5],#1
EORNE R2, LR, R2, ROR #13
BNE %BT30
ASSERT DirCheckByte=-1 ;don't include last word as it contains
SUB R1, R6, #4 ;the check byte
40 ;whole words in tail
LDR LR, [R5],#4
EOR R2, LR, R2,ROR #13
TEQS R5, R1
BNE %BT40
EOR R2, R2, R2, LSR #16 ;reduce to 8 bits
EOR R2, R2, R2, LSR #8
AND R2, R2, #&FF
LDRB LR, [R6,#DirCheckByte] ;compare with old check byte
TEQS R2, LR
MOV LR, R2
Pull "R0-R2,R5,R7,PC"
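The heart of the directory checksum is the EOR-with-rotate accumulate
step. A C sketch of just that step and the final fold (the real routine
above also handles odd bytes and skips the word holding the check byte
itself, which this sketch does not):

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by 13 bits, as in R2, ROR #13. */
static uint32_t ror13(uint32_t x)
{
    return (x >> 13) | (x << 19);
}

/* Accumulate one directory word into the running checksum,
   as in EOR R2, LR, R2, ROR #13 above. */
static uint32_t dir_acc(uint32_t acc, uint32_t word)
{
    return word ^ ror13(acc);
}

/* Fold the 32-bit accumulator down to the final 8-bit check byte. */
static uint8_t dir_fold(uint32_t acc)
{
    acc ^= acc >> 16;
    acc ^= acc >> 8;
    return (uint8_t)(acc & 0xFFu);
}
```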
; =============
; EndDirEntries
; =============
; Find the end of the list of entries in a dir
; entry
; R3 ind disc add of dir
; R5 -> dir start
; R6 -> dir end
; exit
; R0 -> first empty entry
EndDirEntries ROUT
Push "R2,LR"
BL TestDir ;(R3->LR,Z)
ADDEQ R2, R6, #NewDirLastMark
ADDNE R2, R6, #OldDirLastMark
ADD R0, R5, #DirFirstEntry
SUB R0, R0, #DirEntrySz
10 ;loop examining entries
LDRB LR, [R0,#DirEntrySz]!
CMPS LR, #0 ;until null entry
CMPNES R0, R2 ;or exhausted
BLO %BT10
MOVHI R0, R2
Pull "R2,PC",,^